BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data
Authors
Abstract
Runs of homozygosity (RoHs) are genomic stretches of a diploid genome that show identical alleles on both chromosomes. Longer RoHs are unlikely to have arisen by chance but are likely to denote autozygosity, whereby both copies of the genome descend from the same recent ancestor. Early tools to detect RoH used genotype array data, but substantially more information is available from sequencing data. Here, we present and evaluate BCFtools/RoH, an extension to the BCFtools software package, that detects regions of autozygosity in sequencing data, in particular exome data, using a hidden Markov model. By applying it to simulated data and real data from the 1000 Genomes Project, we estimate its accuracy and show that it has higher sensitivity and specificity than existing methods under a range of sequencing error rates and levels of autozygosity.
AVAILABILITY AND IMPLEMENTATION: BCFtools/RoH and its associated binary/source files are freely available from https://github.com/samtools/BCFtools
CONTACT: [email protected] or [email protected]
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Similar resources
A new approach to wind turbine power generation forecasting, using weather radar data based on Hidden Markov Model
The wind is one of the most important and affecting phenomena and is known as one of the significant clean resources of energy. Apart from other atmospheric parameters, the wind has complex behavior and intermittent characteristics. Local phenomena can be accompanied by the wind, which is strong, non-predicted, and damaging. Weather radars are capable of detecting and displaying storm-related ...
Run of Homozygosity a Procedure to Detecting Inbreeding in Farm Animals
Inbreeding depression is a harmful phenomenon in livestock that results from inbreeding. Inbreeding is the consequence of mating between two individuals who are more closely related to each other than the average relatedness in the population, which reduces the fitness of progeny and the genetic variability of populations. Development of high-density genome-wide single nucleotide polymorphism (SNP) array f...
A dynamic Bayesian Markov model for phasing and characterizing haplotypes in next-generation sequencing
MOTIVATION Next-generation sequencing (NGS) technologies have enabled whole-genome discovery and analysis of genetic variants in many species of interest. Individuals are often sequenced at low coverage for detecting novel variants, phasing haplotypes and inferring population structures. Although several tools have been developed for SNP and genotype calling in NGS data, haplotype phasing is of...
HM: detection of runs of homozygosity from whole-exome sequencing data
Motivation: Runs of homozygosity (ROH) are sizable chromosomal stretches of homozygous genotypes, ranging in length from tens of kilobases to megabases. ROHs can be relevant for population and medical genetics, playing a role in predisposition to both rare and common disorders. ROHs are commonly detected by single nucleotide polymorphism (SNP) microarrays, but attempts have been made to use who...
Abnormality Detection in a Landing Operation Using Hidden Markov Model
The air transport industry is seeking to manage risks in air travels. Its main objective is to detect abnormal behaviors in various flight conditions. The current methods have some limitations and are based on studying the risks and measuring the effective parameters. These parameters do not remove the dependency of a flight process on the time and human decisions. In this paper, we used an HMM...